XDM: Relayer and Fraud proof updates #2562
Conversation
…key fetch, and enable XDM test
Makes sense in general.
Ok(Proof::Consensus {
    consensus_chain_mmr_proof: ConsensusChainMmrLeafProof {
        consensus_block_hash: finalized_block.1,
This field seems to be unused.
This is unused because the MMR proof is self-contained, but I wanted to keep the hash of the finalized block so that light clients, or anyone else tracking MMR roots, are able to verify the proof statelessly.
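For illustration, a minimal sketch of how a light client that tracks MMR roots keyed by consensus block hash could use the retained consensus_block_hash to verify such a proof statelessly. All types here and the verify_mmr_proof helper are hypothetical stand-ins, not the actual crate API:

```rust
use std::collections::HashMap;

type BlockHash = [u8; 32];
type MmrRoot = [u8; 32];

// Hypothetical stand-in for the proof carried in Proof::Consensus.
struct ConsensusChainMmrLeafProof {
    consensus_block_hash: BlockHash,
    leaf: Vec<u8>,
    proof: Vec<u8>,
}

// Placeholder for real MMR verification of `leaf` against `root` via `proof`.
fn verify_mmr_proof(_root: &MmrRoot, _leaf: &[u8], _proof: &[u8]) -> bool {
    unimplemented!("actual MMR verification goes here")
}

// A light client that tracks MMR roots needs no chain state: the retained
// consensus_block_hash tells it which tracked root to verify against.
struct LightClient {
    mmr_roots: HashMap<BlockHash, MmrRoot>,
}

impl LightClient {
    fn verify(&self, proof: &ConsensusChainMmrLeafProof) -> bool {
        match self.mmr_roots.get(&proof.consensus_block_hash) {
            Some(root) => verify_mmr_proof(root, &proof.leaf, &proof.proof),
            None => false, // unknown block hash: cannot verify statelessly
        }
    }
}
```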
domains/client/relayer/src/worker.rs (outdated)
.into_iter()
.filter(|(number, _)| *number <= relay_block_until)
.collect();
Relayer::fetch_unprocessed_consensus_blocks_until(&client, chain_id, number, hash)?;
Not related to this PR, but I'm thinking that we may not really need to track and process every block, since the XDM is not extracted from the block body but instead from the state, and as long as the XDM is not responded to it won't be removed from the state.
What we want is some retry in case the XDM is dropped silently in the network during relaying, and we can always retry with the latest finalized consensus block when the notification comes.
XDM will remain in the state while unresponded; I'm not sure I clearly follow what you mean. As and when a consensus block is finalized, we relay the XDM from those blocks.
I realized we still need a fallback mechanism for when a message is dropped silently. What we can do here is run a simultaneous process that re-submits the XDM. I will handle this in a separate PR, and I believe there is an issue created for this already.
> XDM will remain in the state while unresponded; I'm not sure I clearly follow what you mean

I mean we don't need the aux storage for tracking whether a block is processed or not.
Since the XDM is available in the state, whenever a consensus block import/finalization notification comes, we can get the XDM from the state based on the imported/finalized block, filter out already-relayed XDM, and relay the rest. If an XDM is dropped silently during propagation, then when the next consensus block import/finalization notification comes we will retry relaying this XDM without needing another worker. And yeah, let's handle this in a separate PR.
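A rough sketch of that aux-storage-free flow. All names here are hypothetical (the real relayer types differ); the point is that the pending set is re-read from state on every notification, so retries come for free:

```rust
// Hypothetical XDM message and chain handle; illustrative only.
struct Xdm {
    nonce: u64,
    payload: Vec<u8>,
}

struct ChainHandle;

impl ChainHandle {
    // Unresponded XDM stays in state, so it can be re-read at every block.
    fn pending_messages_at(&self, _block_hash: [u8; 32]) -> Vec<Xdm> {
        Vec::new() // placeholder for a state query / runtime API call
    }
    // Whether the destination chain has already seen this message.
    fn already_relayed(&self, _msg: &Xdm) -> bool {
        false // placeholder for a destination-side check
    }
    // Submit the XDM as an unsigned extrinsic.
    fn submit_xdm(&self, _msg: Xdm) {}
}

// Called on every import/finalization notification: no aux storage needed,
// and a silently dropped XDM is simply retried on the next notification.
fn on_block_notification(chain: &ChainHandle, block_hash: [u8; 32]) {
    for msg in chain.pending_messages_at(block_hash) {
        if !chain.already_relayed(&msg) {
            chain.submit_xdm(msg);
        }
    }
}
```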
// check if this domain block is already relayed
if Relayer::fetch_domains_blocks_relayed_at(
    &domain_client,
    domain_id,
    domain_block_number,
)
.contains(&domain_block_hash)
{
    return Ok(None);
}
An edge case here: if the domain stops progressing (e.g. no activity), then we only try to relay the XDM in the last confirmed domain block once, then mark the block processed and skip it afterward without retrying.
True. We need a fallback mechanism that is not built in yet, as I mentioned in the other comment. We will handle that fallback mechanism when we do a cleanup of the aux storage.
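A minimal sketch of what such a fallback could look like: a periodic task that re-submits still-pending XDM even when no new domain block (and hence no new notification) arrives. Everything here, including the names and the placeholder queries, is hypothetical:

```rust
use std::thread;
use std::time::Duration;

// Placeholder queries; in practice these would be state / runtime API calls.
fn pending_xdm_nonces() -> Vec<u64> {
    Vec::new() // nonces of messages still unresponded in state
}
fn submit_xdm(_nonce: u64) {}

// Re-submit pending XDM on a timer, so a stalled domain (no new blocks and
// hence no new notifications) still gets its messages retried.
fn resubmit_worker(period: Duration) {
    loop {
        thread::sleep(period);
        for nonce in pending_xdm_nonces() {
            submit_xdm(nonce);
        }
    }
}
```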
// then fetch new messages in the block
// construct proof of each message to be relayed
// submit XDM as unsigned extrinsic.
-while let Some(block) = chain_block_import.next().await {
+while let Some(block) = chain_block_finalization.next().await {
IIUC there will only be one consensus block finalization notification per segment, so the finalized consensus blocks will not come like #1, #2, … but instead #1000, #2000, …
cc @nazar-pc. Perhaps use the block import notification with something like block_to_process = imported_block - K, as sketched below.
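Sketched out, the suggestion amounts to something like the following, where K is a hypothetical confirmation depth (the real value would come from the chain's configuration):

```rust
// Hypothetical confirmation depth; illustrative value only.
const K: u32 = 100;

// On each block import notification, process the block K blocks below the
// imported one; blocks shallower than K are not yet considered confirmed.
fn block_to_process(imported_block: u32) -> Option<u32> {
    imported_block.checked_sub(K)
}
```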
That is fine. The relayer does not expect blocks to come incrementally. We just need the latest finalized block, and we process all the blocks in between the last processed one and the latest finalized one.
This may cause a problem: if there is one new segment per hour, then all relayers from every domain will idle for an hour and then start relaying XDM at the same time, which brings additional delay and network congestion.
Yes, that is correct. XDM is already quite delayed due to block confirmation, and we explicitly want to wait until the consensus block is finalized.

> notification with something like block_to_process = imported_block - K

This approach defeats the purpose of waiting for consensus block finalization by relying on K-depth instead. I'm not sure why we moved away from it, but I would still like to rely on finalization instead of block import.
I agree using block finalization is better in terms of readability, but the consensus block finalization is coupled with archiving and differs from what is provided by Substrate by default. Besides the above concern about additional delay and network congestion, also consider that if an XDM is dropped silently, it needs to wait for another new segment before the next retry.
> also consider that if an XDM is dropped silently, it needs to wait for another new segment before the next retry.

I don't see any concern here. There will always be a retry of the XDM in the next turn, and the XDM will end up being included in the chain. This delay is what we opted for as a base XDM using confirmation and finalization.
We expect external services to be built on top to speed up these transfers at a higher incentive; this is not a concern for Subspace.
But feel free to propose any potential alternatives for making this approach faster; I would like to better understand what can be done here.
> There will always be a retry of the XDM in the next turn, and the XDM will end up being included in the chain.

The difference is one retry per block versus one retry per segment.

> This delay is what we opted for as a base XDM using confirmation and finalization.
> But feel free to propose any potential alternatives for making this approach faster; I would like to better understand what can be done here.

IMO, the hard required delay is: domain block confirmation (challenge period) + consensus block confirmation (K-block depth), while the consensus block finalization is: K-block depth + the blocks of one segment. Simply using the consensus block import notification + K-block depth can make it faster. Non-blocker.
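To make the comparison concrete, a back-of-the-envelope calculation with purely illustrative numbers (none of these constants come from the actual chain parameters):

```rust
const CHALLENGE_PERIOD: u32 = 100; // domain block confirmation (illustrative)
const K_DEPTH: u32 = 100; // consensus confirmation depth (illustrative)
const BLOCKS_PER_SEGMENT: u32 = 600; // e.g. one segment per hour at 6 s blocks (illustrative)

fn main() {
    // Hard minimum: challenge period plus K-depth.
    let k_depth_delay = CHALLENGE_PERIOD + K_DEPTH;
    // Finalization-based: up to a full segment of extra waiting on top.
    let finalization_delay = CHALLENGE_PERIOD + K_DEPTH + BLOCKS_PER_SEGMENT;
    println!("K-depth: {k_depth_delay} blocks, finalization: {finalization_delay} blocks");
}
```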
> IMO, the hard required delay is: domain block confirmation (challenge period) + consensus block confirmation (K-block depth)

I disagree. The hard requirement for XDM v2 is safety over speed. This means finalization, rather than using K-depth.
From my understanding, K-block depth is when a consensus block is confirmed and irreversible, which is what we need for XDM (and is also used in many other places), while finalization is coupled with the archiving process to ensure the block is not pruned before archiving. Finalization takes much longer than K blocks, but that doesn't mean a confirmed but non-finalized block is revertible. cc @nazar-pc in case my understanding is wrong.
Non-blocker, since using the finalized block as the additional delay is not a concern; please resolve the conflict though.
@@ -1,21 +1,22 @@
#![warn(rust_2018_idioms)]
#![deny(unused_crate_dependencies)]
I'll have to remove this line. It results in the following confusing error:
error: external crate `alloc` unused in `domain_client_message_relayer`: remove the dependency or add `use alloc as _;`
  |
note: the lint level is defined here
 --> domains/client/relayer/src/lib.rs:2:9
  |
2 | #![deny(unused_crate_dependencies)]
  |         ^^^^^^^^^^^^^^^^^^^^^^^^^
error: external crate `compiler_builtins` unused in `domain_client_message_relayer`: remove the dependency or add `use compiler_builtins as _;`
error: external crate `panic_unwind` unused in `domain_client_message_relayer`: remove the dependency or add `use panic_unwind as _;`
error: external crate `proc_macro` unused in `domain_client_message_relayer`: remove the dependency or add `use proc_macro as _;`
error: external crate `test` unused in `domain_client_message_relayer`: remove the dependency or add `use test as _;`
error: could not compile `domain-client-message-relayer` (lib test) due to 3 previous errors
warning: build failed, waiting for other jobs to finish...
error: could not compile `domain-client-message-relayer` (lib) due to 5 previous errors
To reproduce, -Z build-std is needed, for example:
cd domains/client/relayer
cargo clippy --all-targets -Z build-std --target x86_64-unknown-linux-gnu
I'm not sure where it comes from, because there is no such thing in the crate, and I don't think any of the very few macros generate it either (I checked the macro expansion). Might be a compiler issue: rust-lang/rust#122105
relevant: #2660
This PR brings the following updates:
Note:
The first commit is a little packed due to very invasive changes to the relayer. Initially it was two commits, but it did not make sense to keep the changes separate, so they were merged into one.
With this, cross-domain messaging can be enabled for testing, but there are a few more issues we still need to handle.
Code contributor checklist: